Contributor

@eilmiv eilmiv commented Sep 9, 2025

Summary of changes

  • Implement an OAI-PMH 2.0 endpoint under /oai-pmh that allows harvesting of visible training materials in the Dublin Core and RDF (Bioschemas) metadata formats
  • Render the OAI-PMH XML in the browser via an XSLT stylesheet to get an interactive, clickable UI (a common approach for OAI-PMH endpoints)
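
For readers unfamiliar with OAI-PMH, here is a minimal sketch of how a harvester might build requests against such an endpoint. The host name is a placeholder and the helper `oai_pmh_url` is hypothetical; only the /oai-pmh path comes from this pull request. The verbs and the `oai_dc` prefix are defined by the OAI-PMH 2.0 protocol. The prefix used for the RDF (Bioschemas) format is not stated here, so it is left out.

```ruby
require 'uri'

# Placeholder host (assumption); only the /oai-pmh path is taken from the PR.
BASE_URL = 'https://tess.example.org/oai-pmh'

# Hypothetical helper: build an OAI-PMH request URL from a verb and optional arguments.
def oai_pmh_url(verb, **args)
  uri = URI.parse(BASE_URL)
  uri.query = URI.encode_www_form({ verb: verb }.merge(args))
  uri.to_s
end

# Describe the repository and list the metadata formats it offers.
puts oai_pmh_url('Identify')
puts oai_pmh_url('ListMetadataFormats')

# oai_dc is the Dublin Core prefix every OAI-PMH 2.0 repository must support.
puts oai_pmh_url('ListRecords', metadataPrefix: 'oai_dc')
```

Opening any of these URLs in a browser would presumably show the XSLT-rendered view mentioned above, while a harvester would parse the raw XML.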

Motivation and context

This is a relevant step in the mTeSS-X project.

Screenshots

image

Checklist

  • I have read and followed the CONTRIBUTING guide.
  • I confirm that I have the authority necessary to make this contribution on behalf of its copyright owner and agree to license it to the TeSS codebase under the BSD license.

Contributor Author

eilmiv commented Sep 9, 2025

Remaining TODOs for this pull request:

  • Remove accidentally committed PaN-Services code
  • Improve comments and layout of Dublin Core transformation code
  • Add note about modifications made to oai2xhtml.xsl

@eilmiv eilmiv marked this pull request as ready for review September 10, 2025 08:24
@eilmiv eilmiv marked this pull request as draft September 10, 2025 13:40
@eilmiv eilmiv marked this pull request as ready for review September 10, 2025 14:15
Comment on lines 21 to 28
class PublicMaterial < Material
  default_scope { where(visible: true) }

  # Pretend to be a regular Material (for URLs in RDF serialization)
  def self.model_name
    Material.model_name
  end
end
Member

According to the docs, we can pass a scoped relation to OAI::Provider::ActiveRecordWrapper.new instead of needing to make a new class:
https://github.com/code4lib/ruby-oai/blob/master/lib/oai/provider.rb#L249-L253
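
A sketch of what this suggestion could look like, assuming the provider is declared with ruby-oai's `OAI::Provider::Base` and its `source_model` setting. The class name and the repository values are hypothetical, and the actual change in the PR may differ. The guard makes the fragment a no-op outside the Rails app, where neither the gem nor `Material` is loaded.

```ruby
# Only takes effect inside the Rails app, where the ruby-oai gem and the
# Material model are loaded; anywhere else this block is skipped.
if defined?(OAI::Provider::Base) && defined?(Material)
  class MaterialOaiProvider < OAI::Provider::Base # hypothetical class name
    repository_name 'TeSS training materials'          # hypothetical value
    repository_url  'https://tess.example.org/oai-pmh' # hypothetical value

    # Pass a scoped relation instead of a dedicated PublicMaterial subclass.
    source_model OAI::Provider::ActiveRecordWrapper.new(Material.where(visible: true))
  end
end
```

Because the relation is built from `Material` itself, the `model_name` override from the subclass should no longer be needed for URL generation.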

Contributor Author

Good catch. Changed in 7c2a9e7.

Comment on lines 206 to 250
# Dublin Core mappings for OAI-PMH
# no mapping needed for contributor, description and title
# coverage and source not mappable
alias_attribute :creators, :authors

def dates
  [date_published, date_created, date_modified].compact.map(&:iso8601)
end

def format = 'text/html'

def identifier
  if !doi.nil? && !doi.empty?
    doi_iri = doi.start_with?('http://', 'https://') ? doi : "https://doi.org/#{doi}"
  else
    url
  end
end

def language = 'en'

def publishers
  if content_provider
    [content_provider.title]
  else
    []
  end
end

# currently only url of tess resource, content provider url
def relations
  [
    "#{TeSS::Config.base_url}#{Rails.application.routes.url_helpers.material_path(self)}"
  ] + (content_provider ? [content_provider.url] : [])
end

alias_attribute :rights, :licence

def subjects
  keywords + scientific_topics.map(&:uri) + operations.map(&:uri)
end

def types
  ['http://purl.org/dc/dcmitype/Text', 'https://schema.org/LearningResource'] + resource_type
end
Member

I think it would be cleaner to move these methods into another class, if that's possible.

Contributor Author

Moved to one method in 8b54db2.
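
The referenced commit is not shown in this conversation. As an illustration of the reviewer's idea only, a Dublin Core mapping kept outside the model might look roughly like the plain-Ruby sketch below. `MaterialStub` and `DublinCoreMapper` are hypothetical names, the stub stands in for the ActiveRecord model, and only a few of the quoted fields are included.

```ruby
# Stand-in for the Material model (assumption); the real model is ActiveRecord-backed.
MaterialStub = Struct.new(:title, :authors, :doi, :url, :keywords, keyword_init: true)

# Hypothetical mapper; name and interface are illustrative, not taken from the PR.
class DublinCoreMapper
  def initialize(material)
    @material = material
  end

  # Prefer the DOI, expanded to a full IRI when it is a bare DOI; otherwise use the URL.
  def identifier
    doi = @material.doi
    return @material.url if doi.nil? || doi.empty?

    doi.start_with?('http://', 'https://') ? doi : "https://doi.org/#{doi}"
  end

  # Collect the Dublin Core fields in one place instead of separate model methods.
  def to_h
    {
      title: @material.title,
      creators: @material.authors,
      identifier: identifier,
      subjects: @material.keywords,
      format: 'text/html',
      language: 'en'
    }
  end
end

material = MaterialStub.new(
  title: 'Intro to OAI-PMH', authors: ['A. Author'],
  doi: '10.1234/example', url: 'https://example.org/material', keywords: ['metadata']
)
puts DublinCoreMapper.new(material).to_h
```

Keeping the mapping in its own object leaves the model free of methods with generic names such as `format` or `identifier`.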

@eilmiv eilmiv marked this pull request as draft September 12, 2025 07:12
@eilmiv eilmiv marked this pull request as ready for review September 30, 2025 10:50
Member

@fbacall fbacall left a comment

One small review comment and a general question:

How could we extend this to also include events? Would it need a new OAI "provider"?

Comment on lines +194 to +204
def to_rdf
  jsonld_str = to_bioschemas[0].to_json

  graph = RDF::Graph.new
  JSON::LD::Reader.new(jsonld_str) do |reader|
    reader.each_statement { |stmt| graph << stmt }
  end

  rdfxml_str = graph.dump(:rdfxml, prefixes: { sdo: 'http://schema.org/', dc: 'http://purl.org/dc/terms/' })
  rdfxml_str.sub(/\A<\?xml.*?\?>\s*/, '') # remove XML declaration because this is used inside OAI-PMH response
end
Member

Is this used anywhere?
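
For context on the last line of the quoted `to_rdf`: an XML declaration is only allowed at the very start of a document, so a serialized RDF/XML string has to lose its declaration before it can be embedded in a larger response. The sketch below applies the same substitution to a stand-in string. The sample RDF/XML and the simplified `<metadata>` envelope are assumptions for illustration, not actual output of the endpoint.

```ruby
# Stand-in for the string returned by graph.dump(:rdfxml, ...); content is illustrative.
rdfxml_str = <<~XML
  <?xml version='1.0' encoding='utf-8' ?>
  <rdf:RDF xmlns:rdf="http://www.w3.org/1999/02/22-rdf-syntax-ns#"></rdf:RDF>
XML

# Same substitution as in the quoted to_rdf: drop a leading XML declaration.
fragment = rdfxml_str.sub(/\A<\?xml.*?\?>\s*/, '')

# Hypothetical, heavily simplified envelope; a real OAI-PMH response has more elements.
response = %(<?xml version="1.0" encoding="UTF-8"?>\n<metadata>\n#{fragment}</metadata>\n)

puts response
```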
